A WordNet-based Algorithm for Word Sense Disambiguation

Authors

  • Xiaobin Li
  • Stan Szpakowicz
  • Stan Matwin
Abstract

We present an algorithm for automatic word sense disambiguation based on lexical knowledge contained in WordNet and on the results of surface-syntactic analysis. The algorithm is part of a system that analyzes texts in order to acquire knowledge in the presence of as little pre-coded semantic knowledge as possible. On the other hand, we want to make the best use of public-domain information sources such as WordNet. Rather than depend on large amounts of hand-crafted knowledge or statistical data from large corpora, we use syntactic information and information in WordNet and minimize the need for other knowledge sources in the word sense disambiguation process. We propose to guide disambiguation by semantic similarity between words and heuristic rules based on this similarity. The algorithm has been applied to the Canadian Income Tax Guide. Test results indicate that even on a relatively small text the proposed method produces correct noun meaning more than 11% of the time.

* This work was done while the first author was with the Knowledge Acquisition and Machine Learning group at the University of Ottawa. The authors are grateful to the Natural Sciences and Engineering Research Council of Canada for having supported this work with a Strategic Grant No. STR0117764.

1 Introduction

This work is part of the project that aims at a synergistic integration of Machine Learning and Natural Language Processing. The long-term goal of the project is a system that performs machine learning on the results of text analysis to acquire a useful collection of production rules. Because such a system should not require extensive domain knowledge up front, text analysis is to be done in a knowledge-scant setting and with minimal user involvement. A domain-independent surface-syntactic parser produces an analysis of a text fragment (usually a sentence) that undergoes interactive semantic interpretation. By design we only need the user to approve the system's findings or prompt it for alternatives. Also by design we limit ourselves to information sources in the public domain: inexpensive dictionaries and other lexical sources such as WordNet.

WordNet [Miller, 1990; Beckwith et al., 1991] is a very rich source of lexical knowledge. Since most entries have multiple senses, we face a severe problem of ambiguity. The motivation for the work described here is the desire to design a word sense disambiguation (WSD) algorithm that satisfies the needs of our project …
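The abstract's central idea, guiding disambiguation by semantic similarity between words, can be illustrated with a minimal sketch. This is not the authors' algorithm, whose heuristic rules are not given in this excerpt. It assumes a tiny hand-built hypernym tree in place of WordNet, a simple inverse-path-length similarity, and hypothetical names (`HYPERNYM`, `SENSES`, `path_similarity`, `disambiguate`) chosen only for illustration.

```python
# Toy hypernym tree (child -> parent). Hypothetical data, NOT real WordNet.
HYPERNYM = {
    "bank#1": "financial_institution",
    "insurer#1": "financial_institution",
    "financial_institution": "organization",
    "company#1": "organization",
    "organization": "entity",
    "bank#2": "geological_formation",
    "shore#1": "geological_formation",
    "geological_formation": "natural_object",
    "river#1": "body_of_water",
    "body_of_water": "natural_object",
    "natural_object": "entity",
}

# Candidate senses per word (again, an assumed toy inventory).
SENSES = {
    "bank": ["bank#1", "bank#2"],
    "company": ["company#1"],
    "insurer": ["insurer#1"],
    "shore": ["shore#1"],
    "river": ["river#1"],
}


def ancestors(node):
    """Return the chain from node up to the root, node first."""
    chain = [node]
    while node in HYPERNYM:
        node = HYPERNYM[node]
        chain.append(node)
    return chain


def path_similarity(a, b):
    """Return 1 / (1 + edges on the path through the lowest common ancestor)."""
    depth_b = {n: i for i, n in enumerate(ancestors(b))}
    for i, n in enumerate(ancestors(a)):
        if n in depth_b:
            return 1.0 / (1 + i + depth_b[n])
    return 0.0  # no common ancestor


def disambiguate(word, context):
    """Pick the sense of `word` most similar, in total, to the context words."""
    def score(sense):
        return sum(
            max((path_similarity(sense, s) for s in SENSES.get(c, [])),
                default=0.0)
            for c in context
        )
    return max(SENSES[word], key=score)


print(disambiguate("bank", ["company", "insurer"]))
print(disambiguate("bank", ["river", "shore"]))
```

In this toy setting the financial sense wins when the context contains organization-like nouns and the riverside sense wins next to `river` and `shore`. A real system would draw senses and hypernym links from WordNet itself and, as the abstract notes, combine similarity with syntactic information.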

Similar resources

Integrating WordNet and FrameNet using a Knowledge-based Word Sense Disambiguation Algorithm

This paper presents a novel automatic approach to partially integrate FrameNet and WordNet. In that way we expect to extend FrameNet coverage, to enrich WordNet with frame semantic information and possibly to extend FrameNet to languages other than English. The method uses a knowledge-based Word Sense Disambiguation algorithm for linking FrameNet lexical units to WordNet synsets. Specifically, ...

Word Sense Disambiguation for Cross-Language Information Retrieval

We have developed a word sense disambiguation algorithm, following Cheng and Wilensky (1997), to disambiguate among WordNet synsets. This algorithm is to be used in a cross-language information retrieval system, CINDOR, which indexes queries and documents in a language-neutral concept representation based on WordNet synsets. Our goal is to improve retrieval precision through word sense disambig...

SenseRelate::TargetWord - A Generalized Framework for Word Sense Disambiguation

We have previously introduced a method of word sense disambiguation that computes the intended sense of a target word, using WordNet-based measures of semantic relatedness (Patwardhan et al., 2003). SenseRelate::TargetWord is a Perl package that implements this algorithm. The disambiguation process is carried out by selecting that sense of the target word which is most related to the context wo...

Random Walks for Knowledge-Based Word Sense Disambiguation

Word Sense Disambiguation (WSD) systems automatically choose the intended meaning of a word in context. In this article we present a WSD algorithm based on random walks over large Lexical Knowledge Bases (LKB). We show that our algorithm performs better than other graph-based methods when run on a graph built from WordNet and eXtended WordNet. Our algorithm and LKB combination compares favorably...

Exploring the Integration of WordNet and FrameNet

This paper presents a novel automatic approach to partially integrate FrameNet and WordNet. In that way we expect to extend FrameNet coverage, to enrich WordNet with frame semantic information and possibly to extend FrameNet to languages other than English. The method uses a knowledge-based Word Sense Disambiguation algorithm for matching the FrameNet lexical units to WordNet synsets. Specifica...

Improvements To Monolingual English Word Sense Disambiguation

Word Sense Disambiguation remains one of the most complex problems facing computational linguists to date. In this paper we present modifications to the graph-based state-of-the-art algorithm In-Degree. Our modifications entail augmenting the basic Lesk similarity measure with more relations based on the structure of WordNet, adding SemCor examples to the basic WordNet lexical resource and final...

Publication date: 1995